Elliptic curve only hash

Elliptic curve only hash (ECOH)
General
Designers Daniel R. L. Brown, Matt Campagna, Rene Struik
First published 2008
Derived from MuHASH
Detail
Digest sizes 224, 256, 384 or 512
Best public cryptanalysis
Second Pre-Image

The elliptic curve only hash (ECOH) algorithm was submitted as a candidate for SHA-3 in the NIST hash function competition. However, it was rejected early in the competition because a second pre-image attack was found.

ECOH is based on the MuHASH hash algorithm, which has not yet been successfully attacked. However, MuHASH is too inefficient for practical use, so changes had to be made. The main difference is that where MuHASH applies a random oracle, ECOH applies a padding function. Assuming random oracles, finding a collision in MuHASH implies solving the discrete logarithm problem. MuHASH is thus a provably secure hash, i.e. finding a collision is known to be at least as hard as some known hard mathematical problem.

ECOH does not use random oracles, and its security is not directly related to the discrete logarithm problem, yet it is still based on mathematical functions. ECOH is related to Semaev's problem of finding low-degree solutions to the summation polynomial equations over a binary field, called the Summation Polynomial Problem. No efficient algorithm for this problem has been given so far. Although the problem has not been proven NP-hard, it is assumed that no such algorithm exists. Under certain assumptions, finding a collision in ECOH may also be viewed as an instance of the subset sum problem. Besides solving the Summation Polynomial Problem, there is another way to find second pre-images, and thus collisions: Wagner's generalized birthday attack.

ECOH is an example of a hash function that is based on mathematical functions (following the provable security approach) rather than on the classical "ad hoc" mixing of bits to obtain the hash.

The algorithm

Given n, ECOH divides the message M into n blocks M_0,\ldots,M_{n-1}. If the last block is incomplete, it is padded with a single 1 bit followed by the appropriate number of 0 bits. Furthermore, let P be a function that maps a message block and an integer to an elliptic curve point. Using the mapping P, each block is transformed into an elliptic curve point P_i, and these points are added together with two more points. The first additional point X_1 contains the padding and depends only on the message length. The second additional point X_2 depends on the message length and the XOR of all message blocks. The result is truncated to get the hash H.

\begin{align}
P_i &{}:= P(M_i,i)\\
X_1 &{}:= P'(n) \\
X_2 &{}:= P^*(M_i, n)\\
Q &{}:= \sum_{i=0}^{n-1} P_i + X_1 + X_2\\
R &{}:= f(Q)
\end{align}

The two extra points are computed by P' and P^* . Q is the sum of all the elliptic curve points P_i and the two extra points. Finally, the result is passed through an output transformation function f to get the hash result R. To read more about this algorithm, see "ECOH: the Elliptic Curve Only Hash".
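The structure above can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the ECOH specification: it uses a small curve y^2 = x^3 + 2x + 3 over the prime field GF(2^31 - 1) instead of a NIST binary curve, 16-bit blocks, a hypothetical try-and-increment mapping `to_point` standing in for P, P' and P^*, and a truncation of the x-coordinate standing in for the output transformation f.

```python
# Toy parameters (assumptions for illustration, not ECOH's real ones):
# curve y^2 = x^3 + A*x + B over GF(P_MOD), with P_MOD = 3 (mod 4).
P_MOD = 2**31 - 1
A, B = 2, 3

def add(p, q):
    """Add two affine points; None stands for the point at infinity O."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                  # p + (-p) = O
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def to_point(tag, block, index):
    """Hypothetical stand-in for P (tag 0), P' (tag 1) and P^* (tag 2):
    pack the inputs and a counter into x, incrementing until x is on the curve."""
    for counter in range(64):
        x = (tag << 28) | (block << 12) | (index << 6) | counter
        rhs = (x * x * x + A * x + B) % P_MOD
        y = pow(rhs, (P_MOD + 1) // 4, P_MOD)        # square root candidate
        if y * y % P_MOD == rhs:
            return (x, y)
    raise ValueError("no curve point found for this input")

def split_blocks(message: bytes):
    """Split into 16-bit blocks; pad an incomplete last block with a 1 bit, then 0 bits."""
    blocks = [int.from_bytes(message[i:i + 2], "big")
              for i in range(0, len(message) - 1, 2)]
    if len(message) % 2:
        blocks.append((message[-1] << 8) | 0x80)
    return blocks

def ecoh_toy_point(message: bytes):
    """Q = sum of P_i + X_1 + X_2 for the toy parameters above."""
    blocks = split_blocks(message)
    n = len(blocks)
    if n >= 64:
        raise ValueError("toy index field holds at most 63 blocks")
    q, checksum = None, 0
    for i, m in enumerate(blocks):
        q = add(q, to_point(0, m, i))                # P_i := P(M_i, i)
        checksum ^= m
    x1 = to_point(1, 0, n)                           # X_1 depends only on the length
    x2 = to_point(2, checksum, n)                    # X_2: length and XOR of all blocks
    return add(add(q, x1), x2)

def ecoh_toy(message: bytes) -> int:
    """R = f(Q), with f taken to be truncation of x(Q) to 16 bits (toy choice)."""
    q = ecoh_toy_point(message)
    return 0 if q is None else q[0] & 0xFFFF

digest = ecoh_toy(b"abc")
```

Because Q is a plain sum of points, the contribution of each block can be computed independently of the others, which is the property the second pre-image attack later exploits.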

Examples

Four ECOH algorithms were proposed: ECOH-224, ECOH-256, ECOH-384 and ECOH-512. The number represents the size of the message digest in bits. They differ in the length of the parameters, the block size and the elliptic curve used. The first two use the elliptic curve B-283, with field polynomial  X^{283} + X^{12} + X^7 + X^5 + 1 , and parameters (128, 64, 64). ECOH-384 uses the curve B-409, with field polynomial  X^{409} + X^{87} + 1 , and parameters (192, 64, 64). ECOH-512 uses the curve B-571, with field polynomial  X^{571} + X^{10} + X^5 + X^2 + 1 , and parameters (256, 128, 128). It can hash messages of bit length up to  2^{128} .

Properties

Security of ECOH

The ECOH hash functions are based on concrete mathematical functions. They were designed so that the problem of finding collisions would be reducible to a known hard mathematical problem (the subset sum problem). This means that anyone who could find collisions would be able to solve the underlying mathematical problem, which is assumed to be hard and not solvable in polynomial time. Functions with these properties are called provably secure and are quite unusual among hash functions. Nevertheless, a second pre-image (and thus a collision) was later found because the assumptions made in the proof were too strong.

Semaev Summation Polynomial

One way of finding collisions or second pre-images is solving Semaev summation polynomials. For a given elliptic curve E, there exist polynomials  f_n that are symmetric in n variables and that vanish exactly when evaluated at abscissae of points whose sum is 0 in E. So far, no efficient algorithm to solve this problem is known, and it is assumed to be hard (but not proven to be NP-hard).

More formally: let \mathbf{F} be a finite field, E an elliptic curve given by a Weierstrass equation with coefficients in \mathbf{F}, and O the point at infinity. It is known that there exists a multivariable polynomial f_n(X_1,\ldots,X_n) such that f_n(x_1,\ldots,x_n) = 0 if and only if there exist y_1,\ldots,y_n such that (x_1,y_1)+\ldots+(x_n,y_n) = O. This polynomial has degree 2^{n-2} in each variable. The problem is to find this polynomial.
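As a small illustration of the vanishing property, the sketch below checks Semaev's third summation polynomial f_3 on a toy curve. It rests on assumptions that differ from ECOH's setting: the curve is y^2 = x^3 + Ax + B over the prime field GF(97) with A = 2 and B = 3 (ECOH uses binary fields, where the polynomial has a different form), and the helper names are made up for this example. The formula used for f_3 is the one for fields of characteristic other than 2 and 3.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(P_MOD); parameters chosen for illustration.
P_MOD, A, B = 97, 2, 3

def add(p, q):
    """Add two affine points; None stands for the point at infinity O."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def neg(p):
    """Negate a point: (x, y) -> (x, -y)."""
    return None if p is None else (p[0], -p[1] % P_MOD)

def f3(x1, x2, x3):
    """Third summation polynomial; degree 2^(3-2) = 2 in each variable."""
    return ((x1 - x2) ** 2 * x3 ** 2
            - 2 * ((x1 + x2) * (x1 * x2 + A) + 2 * B) * x3
            + (x1 * x2 - A) ** 2 - 4 * B * (x1 + x2)) % P_MOD

def curve_points():
    """Enumerate all affine points of the toy curve by brute force."""
    return [(x, y) for x in range(P_MOD) for y in range(P_MOD)
            if (y * y - x ** 3 - A * x - B) % P_MOD == 0]

def check_vanishing():
    """For every pair (P1, P2) with P1 + P2 != O, set P3 = -(P1 + P2), so that
    P1 + P2 + P3 = O, and confirm that f3 vanishes at the three x-coordinates."""
    pts = curve_points()
    for p1 in pts:
        for p2 in pts:
            s = add(p1, p2)
            if s is not None and f3(p1[0], p2[0], neg(s)[0]) != 0:
                return False
    return True

ok = check_vanishing()
```

Only the x-coordinates enter f_3, which is why the definition above speaks of the existence of suitable y_1,\ldots,y_n rather than of specific points.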

Provable security discussion

The problem of finding collisions in ECOH is similar to the subset sum problem. Solving a subset sum problem is almost as hard as the discrete logarithm problem, and it is generally assumed that this cannot be done in polynomial time. However, a significantly loose heuristic must be assumed: one of the parameters involved in the computation is not necessarily random but has a particular structure. If one adopts this loose heuristic, then finding an internal ECOH collision may be viewed as an instance of the subset sum problem.

A second pre-image attack exists in the form of a generalized birthday attack.

Second pre-image attack

Description of the attack: this is Wagner's generalized birthday attack. It requires 2^{143} time for ECOH-224 and ECOH-256, 2^{206} time for ECOH-384, and 2^{287} time for ECOH-512. The attack sets the checksum block to a fixed value and uses a collision search on the elliptic curve points. Given a message M, we try to find a message M' that hashes to the same value. We first fix the length of M' to six blocks, so  M' = (M_0,M_1,M_2,M_3,M_4,M_5). Let K be a natural number. We choose K different values for (M_0,M_1) and define M_2 by  M_2 := M_0 + M_1 , where addition of blocks denotes bitwise XOR. We compute the K corresponding elliptic curve points P(M_0,0) + P(M_1,1) + P(M_2,2) and store them in a list. We then choose K different random values for (M_3,M_4), define  M_5 := M_3 + M_4 , compute  Q - X_1 - X_2 - P(M_3,3) - P(M_4,4) - P(M_5,5), and store these points in a second list. Note that the target Q is known. X_1 depends only on the length of the message, which we have fixed. X_2 depends on the length and the XOR of all message blocks, but we choose the message blocks such that this XOR is always zero. Thus, X_2 is fixed for all our tries.

If K is larger than the square root of the number of points on the elliptic curve, then we expect one collision between the two lists. This gives us a message (M_0,M_1,M_2,M_3,M_4,M_5) with:
Q = \sum_{i=0}^5 P(M_i,i) + X_1 + X_2
This means that this message leads to the target value Q and is thus a second pre-image, which is what we were looking for. The workload is two times K partial hash computations. For more information, see "A Second Pre-image Attack Against Elliptic Curve Only Hash (ECOH)".
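The two-list search described above can be tried out on a toy instance. The sketch below is an illustration under assumptions, not the attack on the real parameters: it uses a small curve y^2 = x^3 + 2x + 3 over GF(2^19 - 1) so that the search finishes quickly, 8-bit blocks, a hypothetical try-and-increment mapping `encode` standing in for P, P' and P^*, and deterministic enumeration of block pairs instead of random choices.

```python
from itertools import islice, product

# Toy parameters (assumptions): small prime-field curve, 8-bit blocks.
P_MOD = 2**19 - 1
A, B = 2, 3

def add(p, q):
    """Add two affine points; None stands for the point at infinity O."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def neg(p):
    return None if p is None else (p[0], -p[1] % P_MOD)

def encode(tag, block, index):
    """Hypothetical stand-in for P (tag 0), P' (tag 1), P^* (tag 2)."""
    for counter in range(32):
        x = (tag << 16) | (block << 8) | (index << 5) | counter
        rhs = (x * x * x + A * x + B) % P_MOD
        y = pow(rhs, (P_MOD + 1) // 4, P_MOD)
        if y * y % P_MOD == rhs:
            return (x, y)
    raise ValueError("no curve point found for this input")

def hash_point(blocks):
    """Q = sum of P(M_i, i) + X_1 + X_2 for at most 7 one-byte blocks."""
    q, checksum = None, 0
    for i, m in enumerate(blocks):
        q = add(q, encode(0, m, i))
        checksum ^= m
    n = len(blocks)
    return add(add(q, encode(1, 0, n)), encode(2, checksum, n))

def second_preimage(target, k=1024):
    """Search for a six-block message whose point Q equals `target`."""
    x1, x2 = encode(1, 0, 6), encode(2, 0, 6)        # fixed: length 6, XOR 0
    first = {}
    for m0, m1 in islice(product(range(256), repeat=2), k):
        m2 = m0 ^ m1                                 # keeps the XOR at zero
        s = add(add(encode(0, m0, 0), encode(0, m1, 1)), encode(0, m2, 2))
        first[s] = (m0, m1, m2)
    base = add(target, neg(add(x1, x2)))             # Q - X_1 - X_2
    for m3, m4 in product(range(256), repeat=2):
        m5 = m3 ^ m4
        right = add(add(encode(0, m3, 3), encode(0, m4, 4)), encode(0, m5, 5))
        need = add(base, neg(right))                 # entry of the second list
        if need in first:                            # collision between the lists
            return bytes(first[need] + (m3, m4, m5))
    return None

original = b"hello"
forged = second_preimage(hash_point(original))
```

Here the first list is stored in a dictionary and the entries of the second list are generated one at a time and looked up immediately, which finds the same collisions as storing both lists. A returned message is a second pre-image of the toy hash because its point Q equals the target by construction.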

Actual parameters:

ECOH2

The official comments on ECOH included a proposal called ECOH2 that doubles the elliptic curve size in an effort to stop the Halcrow-Ferguson second preimage attack with a prediction of improved or similar performance.

See also

Provably secure cryptographic hash function